The Implicit Bias of Adam and Muon on Smooth Homogeneous Neural Networks
We study the implicit bias of momentum-based optimizers on homogeneous models. We first extend existing results on the implicit bias of steepest descent in homogeneous models to normalized steepest descent with an optional learning rate schedule. We then show that for smooth homogeneous models, momentum steepest descent algorithms such as Muon (spectral norm), MomentumGD ($\ell_2$ norm), and Signum ($\ell_\infty$ norm) are approximate steepest descent trajectories under a decaying learning rate schedule, proving that these algorithms, too, are biased towards KKT points of the corresponding margin maximization problem. We extend the analysis to Adam (without the stability constant), which maximizes the $\ell_\infty$ margin, and to Muon-Signum and Muon-Adam, which maximize the margin with respect to a hybrid norm. Our experiments corroborate the theory and show that which margin is maximized depends on the choice of optimizer. Overall, our results extend earlier lines of work on steepest descent in homogeneous models and on momentum-based optimizers in linear models.
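For orientation, here is a minimal sketch (in our own notation, not the paper's) of the recipe the abstract refers to: normalized steepest descent under a generic norm $\|\cdot\|$, and the margin maximization problem whose KKT points it is biased toward. The norm is spectral for Muon, $\ell_2$ for MomentumGD, and $\ell_\infty$ for Signum and Adam; $f$ denotes the network, $\mathcal{L}$ the empirical loss, and $\eta_t$ the learning rate.

```latex
% Sketch (our notation): normalized steepest descent under a norm \|\cdot\|
% and the associated margin maximization problem.
\begin{align}
  \theta_{t+1} &= \theta_t - \eta_t \Delta_t,
  \qquad
  \Delta_t \in \operatorname*{arg\,max}_{\|\Delta\| \le 1}
    \big\langle \nabla \mathcal{L}(\theta_t),\, \Delta \big\rangle, \\
  \text{margin problem:}\quad
  &\min_{\theta}\ \|\theta\|
  \quad \text{s.t.} \quad y_i f(\theta; x_i) \ge 1,\ \ i = 1, \dots, n.
\end{align}
```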
On Margin Maximization in Linear and ReLU Networks
The implicit bias of neural networks has been extensively studied in recent years. Lyu and Li (2019) showed that in homogeneous networks trained with the exponential or the logistic loss, gradient flow converges to a KKT point of the max margin problem in parameter space. However, that leaves open the question of whether this point will generally be an actual optimum of the max margin problem. In this paper, we study this question in detail, for several neural network architectures involving linear and ReLU activations. Perhaps surprisingly, we show that in many cases, the KKT point is not even a local optimum of the max margin problem. On the flip side, we identify multiple settings where a local or global optimum can be guaranteed.
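For context on the result being examined, a minimal restatement (again in our own notation) of the parameter-space max-margin problem and its KKT conditions for a homogeneous network $f$ and labels $y_i \in \{\pm 1\}$; Lyu and Li (2019) show that gradient flow on the exponential or logistic loss converges in direction to a point satisfying these conditions.

```latex
% Sketch (our notation): the \ell_2 max-margin problem in parameter space
% and its KKT conditions for a homogeneous network f.
\begin{align}
  &\min_{\theta}\ \tfrac{1}{2}\|\theta\|_2^2
   \quad \text{s.t.}\quad y_i f(\theta; x_i) \ge 1,\ \ i = 1, \dots, n, \\
  \text{KKT:}\quad
  &\theta = \sum_{i=1}^n \lambda_i\, y_i\, \nabla_\theta f(\theta; x_i),
   \qquad \lambda_i \ge 0,
   \qquad \lambda_i \big( y_i f(\theta; x_i) - 1 \big) = 0 \ \ \forall i.
\end{align}
```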